Multi-band and multi-cue analyses of disordered connected speech
Abstract
The objective is to analyze vocal dysperiodicities in connected speech produced by dysphonic speakers. The analysis involves a speech variogram-based method that tracks instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by the signal-to-dysperiodicity ratio, which has been shown to correlate strongly with the perceived degree of hoarseness of the speaker. Previously, this method was evaluated on small corpora. In the study reported here, the corpus comprises 28 normophonic and 223 dysphonic speakers. This larger corpus makes it possible to carry out the analysis in multiple frequency bands and to submit the per-band signal-to-dysperiodicity ratios to multi-variable linear regression analysis, with a view to predicting the perceptual ratings of the disordered speech fragments. The results are compared to the cepstral peak prominence, a cue that indirectly summarizes vocal dysperiodicities frame by frame via the size of the first rahmonic of the speech cepstrum. Results show that the signal-to-dysperiodicity ratios obtained for low-frequency bands up to 1500 Hz contribute most to the prediction of the perceptual scores. Also, combining the cepstral peak prominence with the low-frequency-band signal-to-dysperiodicity ratio increases their common correlation with perceptual scores to 0.8.
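To make the idea of a signal-to-dysperiodicity ratio concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the frame length, the lag search range, or the weighting scheme, so the 2.5 ms frame, the 60 to 400 Hz lag range, the energy-equalizing gain, and the function names `dysperiodicity_trace` and `signal_to_dysperiodicity_ratio` are all assumptions chosen for illustration. Each short frame is compared with earlier frames at candidate lags, the smallest residual is kept as the local dysperiodicity, and the ratio of signal energy to residual energy is reported in dB.

```python
import numpy as np


def dysperiodicity_trace(x, fs, frame_ms=2.5, f0_min=60.0, f0_max=400.0):
    """Simplified variogram-style dysperiodicity estimate (illustrative only).

    Assumed parameters: frame length and lag range are not taken from the paper.
    Returns the residual trace and a mask of the samples that were analyzed.
    """
    x = np.asarray(x, dtype=float)
    frame = max(1, int(round(frame_ms * 1e-3 * fs)))
    lag_min = max(1, int(fs / f0_max))  # shortest candidate cycle length
    lag_max = int(fs / f0_min)          # longest candidate cycle length
    e = np.zeros_like(x)
    valid = np.zeros(x.shape, dtype=bool)

    # Start at lag_max so that every candidate lag has enough history.
    for start in range(lag_max, len(x) - frame + 1, frame):
        cur = x[start:start + frame]
        cur_energy = np.dot(cur, cur)
        best_err, best_res = np.inf, cur
        for lag in range(lag_min, lag_max + 1):
            prev = x[start - lag:start - lag + frame]
            prev_energy = np.dot(prev, prev)
            # Assumed weighting: equalize the energies of the two frames.
            gain = np.sqrt(cur_energy / prev_energy) if prev_energy > 0 else 0.0
            res = cur - gain * prev
            err = np.dot(res, res)
            if err < best_err:
                best_err, best_res = err, res
        e[start:start + frame] = best_res
        valid[start:start + frame] = True
    return e, valid


def signal_to_dysperiodicity_ratio(x, fs, **kwargs):
    """Ratio of signal energy to dysperiodicity energy, in dB."""
    x = np.asarray(x, dtype=float)
    e, valid = dysperiodicity_trace(x, fs, **kwargs)
    if not valid.any():
        raise ValueError("signal is too short for the chosen lag range")
    signal_energy = np.sum(x[valid] ** 2)
    dys_energy = np.sum(e[valid] ** 2)
    # Floor the denominator so a perfectly periodic signal stays finite.
    return 10.0 * np.log10(signal_energy / max(dys_energy, 1e-12))
```

A multi-band variant in the spirit of the abstract would band-pass filter the signal first and call `signal_to_dysperiodicity_ratio` once per band, then use the per-band values as regressors. The band edges and filter design would be further assumptions.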
Similar resources
Multi-band dysperiodicity analyses of disordered connected speech
The objective is to analyze vocal dysperiodicities in connected speech produced by dysphonic speakers. The analysis involves a variogram-based method that enables tracking instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by means of the signal-to-dysperiodicity ratio, which has been shown to correlate strongly with the perceived degree of hoarseness of the speaker. P...
Multi-band Segmental Signal-to-dysperiodicity Ratios in Connected Speech Produced by Normophonic and Dysphonic Speakers
The objective is to analyze vocal dysperiodicities in connected speech produced by dysphonic speakers. The analysis involves a variogram-based method that enables tracking instantaneous vocal dysperiodicities. The dysperiodicity trace is summarized by means of the signal-to-dysperiodicity ratio, which has been shown to correlate strongly with the perceived degree of hoarseness of the speaker. P...
A weight estimation method using LDA for multi-band speech recognition
This paper proposes a band-weight estimation method using Linear Discriminant Analysis (LDA) for multi-band automatic speech recognition (ASR). In our scheme, a spectral domain feature, SPEC, is modeled using a multi-stream HMM technique. This paper also proposes the use of Output Likelihood Normalization (OLN) in combination with the LDA-based weight-estimation method in order to adjust the re...
Asynchrony with trained transition probabilities improves performance in multi-band speech recognition
One of the central themes in multi-band automatic speech recognition (ASR) is to devise a strategy for recombining sub-band information. This in turn raises two questions: (1) at what phonetic unit should the recombination take place? (2) How asynchronously should the sub-bands be run? Theoretically asynchronous multi-band ASR should perform at least as well as synchronous multi-band ASR. Howev...
Exact multi-electronic electron-concentration dependent ground-states for disordered two-dimensional two-band systems in presence of disordered hoppings and finite on-site random interactions
We report exact multielectronic ground-states dependent on electron concentration for quantum mechanical two-dimensional disordered two-band type many body models in the presence of disordered hoppings and disordered repulsive finite Hubbard interactions, in fixed lattice topology considered provided by Bravais lattices. The obtained ground-states lose their eigenfunction character for independen...